Microaggregation Sorting Framework for K-Anonymity Statistical Disclosure Control in Cloud Computing
نویسندگان
چکیده
In cloud computing, there have led to an increase in the capability to store and record personal data (microdata) in the cloud. In most cases, data providers have no/little control that has led to concern that the personal data may be beached. Microaggregation techniques seek to protect microdata in such a way that data can be published and mined without providing any private information that can be linked to specific individuals. An optimal microaggregation method must minimize the information loss resulting from this replacement process. The challenge is how to minimize the information loss during the microaggregation process. This paper presents a sorting framework for Statistical Disclosure Control (SDC) to protect microdata in cloud computing. It consists of two stages. In the first stage, an algorithm sorts all records in a data set in a particular way to ensure that during microaggregation very dissimilar observations are never entered into the same cluster. In the second stage a microaggregation method is used to create k-anonymous clusters while minimizing the information loss. The performance of the proposed techniques is compared against the most recent microaggregation methods. Experimental results using benchmark datasets show that the proposed algorithms perform significantly better than existing associate techniques in the literature.
منابع مشابه
Improved Univariate Microaggregation for Integer Values
Privacy issues during data publishing is an increasing concern of involved entities. The problem is addressed in the field of statistical disclosure control with the aim of producing protected datasets that are also useful for interested end users such as government agencies and research communities. The problem of producing useful protected datasets is addressed in multiple computational priva...
متن کاملNew Multi-dimensional Sorting Based K-Anonymity Microaggregation for Statistical Disclosure Control
In recent years, there has been an alarming increase of online identity theft and attacks using personally identifiable information. The goal of privacy preservation is to de-associate individuals from sensitive or microdata information. Microaggregation techniques seeks to protect microdata in such a way that can be published and mined without providing any private information that can be link...
متن کاملK−means Clustering Microaggregation for Statistical Disclosure Control
This paper presents a K-means clustering technique that satisfies the biobjective function to minimize the information loss and maintain k-anonymity. The proposed technique starts with one cluster and subsequently partitions the dataset into two or more clusters such that the total information loss across all clusters is the least, while satisfying the k-anonymity requirement. The structure of ...
متن کاملA polynomial-time approximation to optimal multivariate microaggregation
Microaggregation is a family of methods for statistical disclosure control (SDC) of microdata (records on individuals and/or companies), that is, for masking microdata so that they can be released without disclosing private information on the underlying individuals. Microaggregation techniques are currently being used by many statistical agencies. The principle of microaggregation is to group o...
متن کاملAn approximate microaggregation approach for microdata protection
Microdata protection is a hot topic in the field of Statistical Disclosure Control, which has gained special interest after the disclosure of 658000 queries by the America Online (AOL) search engine in August 2006. Many algorithms, methods and properties have been proposed to deal with microdata disclosure. One of the emerging concepts in microdata protection is k-anonymity, introduced by Samar...
متن کامل